Fix indefinite hang on OpenAPI schemas with cyclic model dependencies#3078
Conversation
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (5)
🚧 Files skipped from review as they are similar to previous changes (5)
📝 WalkthroughWalkthroughStabilizes data-model ordering to avoid non-terminating permutations for circular base-class graphs, adds a guard to skip discriminator path validation when base-class references are missing, deduplicates union type-hint members, updates test dependencies, and adds OpenAPI/JSON Schema regression tests plus expected outputs. ChangesDependency ordering, discriminator handling, model typing, tests
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Suggested labels
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Tip 💬 Introducing Slack Agent: The best way for teams to turn conversations into code.Slack Agent is built on CodeRabbit's deep understanding of your code, so your team can collaborate across the entire SDLC without losing context.
Built for teams:
One agent for your entire SDLC. Right inside Slack. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Merging this PR will not alter performance
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
tests/parser/test_base.py (1)
270-303: Consider asserting a postcondition, not just “no exception.”This catches the crash, but it would be a bit stronger if it also pinned one observable outcome after
__apply_discriminator_type()runs, so a future silent no-op doesn’t still pass.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/parser/test_base.py` around lines 270 - 303, The test currently only ensures no exception is raised; add a concrete postcondition after calling parser._Parser__apply_discriminator_type to ensure the union was not silently modified by the discriminator logic: e.g., assert that union_inner still contains exactly the two members referencing ref_pet and ref_other (check identities of union_inner.data_types entries or their .reference attributes) and that pet_model.base_classes remains unchanged — locate this in test_apply_discriminator_type_skips_base_class_without_reference and add the assertion(s) after the call to parser._Parser__apply_discriminator_type([root], Imports()).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tests/parser/test_base.py`:
- Around line 270-303: The test currently only ensures no exception is raised;
add a concrete postcondition after calling
parser._Parser__apply_discriminator_type to ensure the union was not silently
modified by the discriminator logic: e.g., assert that union_inner still
contains exactly the two members referencing ref_pet and ref_other (check
identities of union_inner.data_types entries or their .reference attributes) and
that pet_model.base_classes remains unchanged — locate this in
test_apply_discriminator_type_skips_base_class_without_reference and add the
assertion(s) after the call to parser._Parser__apply_discriminator_type([root],
Imports()).
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: af656bb8-01d5-4471-9440-e33695a9c92c
📒 Files selected for processing (2)
src/datamodel_code_generator/parser/base.pytests/parser/test_base.py
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #3078 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 88 88
Lines 18512 18526 +14
Branches 2138 2140 +2
=========================================
+ Hits 18512 18526 +14
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
ilovelinux
left a comment
There was a problem hiding this comment.
This PR implements cycle detection as it is usually done in graph algorithms (BFS, DFS, etc...).
LGTM! Thank you @kevin-paulson-mindbridge-ai 🙂
ilovelinux
left a comment
There was a problem hiding this comment.
Hi @kevin-paulson-mindbridge-ai, I realized it would be great to add some generation test for schemas with cyclic references in order to track the generated output of this corner-case.
See tests/main/{openapi,jsonschema,graphql}/test_main_*.py for examples. 🙂
…to track the generated output
There was a problem hiding this comment.
🧹 Nitpick comments (2)
tests/main/openapi/test_main_openapi.py (1)
4756-4761: Add an explicit timeout to this hang-regression test.The test for cyclic component reference handling (which previously caused indefinite hangs) has no timeout protection. If this regression returns, CI will stall until a global job timeout. Since
run_main_and_assert()does not enforce a per-test timeout, add one directly to this test using@pytest.mark.timeout()to fast-fail if the code hangs again.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/main/openapi/test_main_openapi.py` around lines 4756 - 4761, This test calling run_main_and_assert(...) for "cyclic_component_refs.yaml" lacks a per-test timeout and can hang CI; add a pytest timeout decorator (use `@pytest.mark.timeout`(seconds)) to the test function that invokes run_main_and_assert for cyclic_component_refs.yaml so the test fails fast if it hangs—place the decorator immediately above the test definition and choose a reasonable timeout (e.g., 10–30s) to prevent indefinite hangs while allowing normal runs.tests/main/graphql/test_main_graphql.py (1)
924-929: Make the expected snapshot explicit in this regression test.On Line 924, this relies on implicit
run_main_and_assertbehavior for expected output resolution. Please passexpected_file="cyclic_mutual_types.py"explicitly to keep the regression assertion unambiguous and resilient to helper changes.Proposed patch
def test_main_cyclic_mutual_types(output_file: Path) -> None: """Mutual object-type cycle (A <-> B) in GraphQL.""" run_main_and_assert( input_path=GRAPHQL_DATA_PATH / "cyclic-mutual-types.graphql", output_path=output_file, input_file_type="graphql", assert_func=assert_file_content, + expected_file="cyclic_mutual_types.py", )🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/main/graphql/test_main_graphql.py` around lines 924 - 929, Update the test call to run_main_and_assert to pass the explicit expected snapshot filename so the regression is unambiguous: modify the invocation of run_main_and_assert (the one using input_path=GRAPHQL_DATA_PATH / "cyclic-mutual-types.graphql", output_path=output_file, input_file_type="graphql", assert_func=assert_file_content) to include expected_file="cyclic_mutual_types.py" as an additional argument.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@tests/main/graphql/test_main_graphql.py`:
- Around line 924-929: Update the test call to run_main_and_assert to pass the
explicit expected snapshot filename so the regression is unambiguous: modify the
invocation of run_main_and_assert (the one using input_path=GRAPHQL_DATA_PATH /
"cyclic-mutual-types.graphql", output_path=output_file,
input_file_type="graphql", assert_func=assert_file_content) to include
expected_file="cyclic_mutual_types.py" as an additional argument.
In `@tests/main/openapi/test_main_openapi.py`:
- Around line 4756-4761: This test calling run_main_and_assert(...) for
"cyclic_component_refs.yaml" lacks a per-test timeout and can hang CI; add a
pytest timeout decorator (use `@pytest.mark.timeout`(seconds)) to the test
function that invokes run_main_and_assert for cyclic_component_refs.yaml so the
test fails fast if it hangs—place the decorator immediately above the test
definition and choose a reasonable timeout (e.g., 10–30s) to prevent indefinite
hangs while allowing normal runs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 6a41a0a5-bdda-41a2-953a-aae47badee83
⛔ Files ignored due to path filters (3)
tests/data/graphql/cyclic-mutual-types.graphqlis excluded by!tests/data/**/*.graphqland included by nonetests/data/jsonschema/cyclic_mutual_defs.jsonis excluded by!tests/data/**/*.jsonand included by nonetests/data/openapi/cyclic_component_refs.yamlis excluded by!tests/data/**/*.yamland included by none
📒 Files selected for processing (6)
tests/data/expected/main/graphql/cyclic_mutual_types.pytests/data/expected/main/jsonschema/cyclic_mutual_defs.pytests/data/expected/main/openapi/cyclic_component_refs.pytests/main/graphql/test_main_graphql.pytests/main/jsonschema/test_main_jsonschema.pytests/main/openapi/test_main_openapi.py
✅ Files skipped from review due to trivial changes (3)
- tests/data/expected/main/openapi/cyclic_component_refs.py
- tests/data/expected/main/jsonschema/cyclic_mutual_defs.py
- tests/data/expected/main/graphql/cyclic_mutual_types.py
ilovelinux
left a comment
There was a problem hiding this comment.
I locally cherry-picked the last two commits (the ones that introduce the generation tests) to main. I expect them to fail (or hang) but they succeed instead.
Given remote:
$ git remote add kevin-paulson-mindbridge-ai https://github.com/kevin-paulson-mindbridge-ai/datamodel-code-generator.git
$ git fetch kevin-paulson-mindbridge-aiTo replicate. exec from main branch:
$ git cherry-pick 7786cf90867db72f716f1796ab071fefcffe4dc5
$ git cherry-pick 8d3ee3958ce8d9c595fdaa3d46a73bba2b54fa23
$ tox -e py314-parallel -- -k cyclicThis means that the cyclic model dependency added as test was already handled by the tool.
In fact, also the other tests (which implements the same cyclic model dependency corner-case) works on main.
$ git switch --detach kevin-paulson-mindbridge-ai/slow_schema_generation_fix
$ git restore src/datamodel_code_generator/parser/base.py --source main
$ tox -e py314-parallel -- -k cyclic@kevin-paulson-mindbridge-ai could you double-check the corner-case you have in your OpenAPI schema? As now, it looks like the one you proposed was already supported, therefore we cannot reproduce the issue, validate the fix & add regression tests 🙁
…ctually reproduce the issue (don't complete in a reasonable time)
My apologies, I neglected to validate that the new tests reproduced the issue before the change. I have updated the 3 tests such that they reproduce the issue before the change (they each ran at least 30 seconds before I killed them) and after the change they pass quickly (less than a second each). They look OK to me, but I am most familiar with OpenAPI and I am very unfamiliar with graphql. |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tests/data/expected/main/jsonschema/cyclic_mutual_defs.py`:
- Around line 14-19: The generated classes Person and Team create an impossible
mutual inheritance cycle (Person(Team, Model) and Team(Person, Model)) causing a
runtime NameError; replace this mutual inheritance with a valid hierarchy by
removing the forward base reference—e.g., have both classes inherit from Model
only or make one inherit from the other (but not both) so that Person and Team
do not form a direct cycle; update the class headers for Person and Team (the
class definitions named Person and Team) accordingly and ensure any
tests/fixtures referencing Model still import/define Model as before.
In `@tests/data/expected/main/openapi/cyclic_component_refs.py`:
- Around line 14-19: The expected fixture currently creates an unimportable
cyclic base-class relationship between Person and Team (Person(Team, Model) and
Team(Person, Model)); fix it by removing the cyclic inheritance and making both
classes inherit only from Model (e.g., class Person(Model): and class
Team(Model):) and, if a cross-reference is needed, add attributes that reference
the other type as a forward annotation (e.g., person has team: "Team" | None =
None and team has member: "Person" | None = None) so the file can be imported
successfully while preserving the relationship between Person and Team.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: c31b332b-2832-4806-b2b4-639415011af1
⛔ Files ignored due to path filters (3)
tests/data/graphql/cyclic-mutual-types.graphqlis excluded by!tests/data/**/*.graphqland included by nonetests/data/jsonschema/cyclic_mutual_defs.jsonis excluded by!tests/data/**/*.jsonand included by nonetests/data/openapi/cyclic_component_refs.yamlis excluded by!tests/data/**/*.yamland included by none
📒 Files selected for processing (3)
tests/data/expected/main/graphql/cyclic_mutual_types.pytests/data/expected/main/jsonschema/cyclic_mutual_defs.pytests/data/expected/main/openapi/cyclic_component_refs.py
That's not a problem, I appreciate your effort. Reviews & double-checks are useful because of that 🙂
I'll review it as soon as I'll get some free time 🙂
Could you add a timeout as suggested by the first nitpick comment here: #3078 (review)? |
1) Avoid infinite loop due to cyclic inheritance with discriminators 2) De-duplicate union members (but keep same order)
|
I think I've got things much closer now. I'll plan to come back to this to:
|
|
@coderabbitai resume |
✅ Actions performedReviews resumed. |
This is now ready for review. Note that I don't believe graphql has the same issue due to the code paths followed and the nature of graphql. |
The requested changes were for an earlier version of the PR. The current regression cases reproduce the hang on main and pass on this branch, and the previous concerns appear addressed.
|
@kevin-paulson-mindbridge-ai |
Breaking Change AnalysisResult: No breaking changes detected Reasoning: All three changes are bug fixes with no breaking impact: (1) The cycle detection in sort_data_models only triggers for schemas that previously caused an infinite hang — there was no working output to break. (2) The union type hint deduplication changes output from e.g. Union[A, A, B] to Union[A, B], but these are semantically identical in Python — the generated code behaves exactly the same. This only applies in the narrow forward-reference fallback path. (3) The null-reference guard in discriminator handling prevents crashes on base classes without references, with no behavior change for schemas that previously worked. No CLI/API changes, no default behavior changes, no template changes, no Python version changes. This analysis was performed by Claude Code Action |
|
🎉 Released in 0.57.0 This PR is now available in the latest release. See the release notes for details. |
We've been generating a model with the tool based on our OpenAPI schema, however since version 0.37.0 we've been unsuccessful in doing so. In the latest release (0.56.0) the generation was revealed to be stuck in
sort_data_models. I waited up to an hour, I don't think it was ever going to complete. With this change it now works as expected.I tried to follow https://datamodel-code-generator.koxudaxi.dev/development-contributing/ as best as I can, but let me know if you would like me to make any changes or if I should create an issue or anything else. Thanks!
Summary by CodeRabbit
Bug Fixes
Tests
Chores